Structure of Alphacoronavirus Transmissible Gastroenteritis Virus
nsp1 Has Implications for Coronavirus nsp1 Function and Evolution

Anna M. Jansson 

Department of Cell and Molecular Biology, Uppsala University, Biomedical Center, Uppsala, Sweden 

Coronavirus nsp1 has been shown to induce suppression of host gene expression and to interfere with
the host immune response. However, the mechanism is currently unknown. The only available structural
information on coronavirus nsp1 is the nuclear magnetic resonance (NMR) structure of the N-terminal
domain of nsp1 from severe acute respiratory syndrome coronavirus (SARS-CoV) from the betacoronavirus
genus. Here we present the first nsp1 structure from an alphacoronavirus, transmissible
gastroenteritis virus (TGEV) nsp1. It displays a six-stranded β-barrel fold with a long alpha helix
on the rim of the barrel, a fold shared with SARS-CoV nsp1(13-128). Contrary to previous speculation,
the TGEV nsp1 structure suggests that coronavirus nsp1s have a common origin, despite the lack of
sequence homology. However, comparisons of surface electrostatics, shape, and amino acid conservation
between the alpha- and betacoronaviruses lead us to speculate that the mechanism for nsp1-induced
suppression of host gene expression might be different in these two genera.


Coronaviruses (CoVs) cause mainly respiratory and enteric disease (1). In farm animals, these
viruses cause severe disease and lead to large economic losses. In humans, CoVs generally cause mild
symptoms, like the common cold. However, the emergence of severe acute respiratory syndrome (SARS)
in 2003 made it apparent that CoVs could also cause serious disease in the human population. CoVs
contain a positive, single-stranded RNA genome of about 30 kb, which is the largest among RNA
viruses (2, 3). The replicase gene, comprising two-thirds of the genome, encodes two large precursor
polyproteins that are cleaved into 16 nonstructural proteins (nsps), where nsp1 is the first to be
expressed (2, 4-7).

CoVs were originally classified into three groups based on antigenic cross-reactivity (8).
Subsequent phylogenetic analysis, including analysis of the replicase region, rendered the same
three clusters with few exceptions. These were called groups 1, 2, and 3 (9, 10). When SARS
coronavirus (SARS-CoV), the etiological agent of SARS, was discovered (11-13), it was placed as the
only member in an early split-off from group 2, in subgroup 2b (10). This effectively put the
viruses previously established to be members of group 2 in subgroup 2a. These groups have now been
recognized as genera, where groups 1, 2, and 3 have become the genera alpha-, beta-, and
gammacoronaviruses (α-CoVs, β-CoVs, and γ-CoVs), and SARS-CoV is placed in lineage B of the beta
genus, β-CoV B. Since then, several SARS-like viruses have been identified, mainly in bats, and
placed in β-CoV B (14, 15).

The CoV genome is generally well conserved between the genera. The largest differences in the
replicase gene can be found in the 5' end, and the most N-terminal cleavage product, nsp1, is
considered one of the genus-specific markers (10, 16). This is based on both sequence comparisons
and the fact that nsp1 from α-CoV, that from β-CoV A, and that from β-CoV B are different in size:
~110, 250, and 180 amino acids, respectively. In contrast to the α-CoVs and β-CoVs, the γ-CoVs do
not contain an nsp1 protein (17). The fact that no sequence homology could be inferred between the
different nsp1s, or any host protein, raised the question of whether these proteins shared similar
structures and functions (16). However, several studies have shown that nsp1s from alpha- and
betacoronaviruses display both differences and similarities.


It is established that nsp1 suppresses translation of host mRNA. nsp1s from human CoV-229E
(HCoV-229E), murine hepatitis virus (MHV), and SARS-CoV significantly reduce reporter gene
expression in HEK293 cells (18-20). In several cell lines, SARS-CoV nsp1 suppresses host gene
expression, including that of type I interferon, involved in the host immune response (21).
SARS-CoV nsp1 also promotes the degradation of host mRNA (22, 23). Like SARS-CoV nsp1,
transmissible gastroenteritis virus (TGEV) nsp1 can efficiently suppress host mRNA translation,
although it seems to lack the ability to modify and degrade host mRNAs. There are indications that
SARS-CoV nsp1 also suppresses the expression of the CoV genes (22), but recent experiments on
SARS-CoV nsp1 suggest that a short sequence in the 5' end common to all CoV mRNAs protects the
viral RNA from degradation (24, 25). Deletion of nsp1 from infectious clones of MHV from β-CoV A
abolishes the ability of the virus to infect cultured cells (26). A mutation in the cleavage site
between nsp1 and nsp2 in the α-CoV TGEV, preventing the release of nsp1, leads to a drastic
decrease in the viability of the virus (27).

The observed biochemical effects of nsp1 highlight the importance of this protein in the CoV life
cycle and its potential role as a significant virulence factor, as well as its importance for
evasion of host responses. This also indicates that nsp1 is an interesting target in the search for
new antiviral drugs. The frequent detection of SARS-like CoV in mammalian hosts indicates a high
risk of reintroduction into the human population (14). For development of vaccines and antivirals,
it is important to understand CoV pathogenicity and its mechanism for avoiding host antiviral
systems.

This article presents the first high-resolution crystal structure of nsp1 from an alphacoronavirus.
To date, the only known structure of nsp1 from coronaviruses is that of SARS-CoV nsp1(13-128),
belonging to the betacoronavirus genus, determined by nuclear magnetic resonance (NMR) (28). The
structure of TGEV nsp1 reflects the structural and functional similarities and differences between
α-CoV and β-CoV. It also suggests that nonstructural protein 1 was not acquired independently by
the different coronavirus genera.

MATERIALS AND METHODS 

Cloning, protein expression, and purification. A full-length construct of TGEV nsp1, including an
N-terminal His6 tag, was cloned into the expression plasmid pDEST14 (Invitrogen). The protein was
expressed in Escherichia coli BL21-AI cells (Invitrogen) grown in LB medium at 37°C. When the
optical density at 600 nm (OD600) reached 0.6, the culture was transferred to 25°C and protein
expression was induced with L-arabinose (2 g/liter). After 3 to 5 h, the cells were harvested by
centrifugation. The cells were washed in 1× SSP buffer (150 mM NaCl, 250 mM NaH2PO4, pH 7.4) prior
to storage at −20°C. For protein purification, the cells from a 1-liter culture were thawed and
resuspended in 20 ml lysis buffer (50 mM Na2HPO4, 50 mM Na2SO4, 100 mM HEPES, 200 mM NaCl, 10 mM
imidazole, 0.5% Triton X-100, 14 mM β-mercaptoethanol, pH 8.0) supplemented with 0.01 mg/ml RNase,
0.02 mg/ml DNase, and 0.25 mg/ml lysozyme. The cells were subsequently lysed under 2 × 10^5 kPa of
pressure using a Constant cell disrupter (Constant Systems Ltd.), and the lysate was centrifuged at
8°C and 45,000 × g (SS-34 rotor; Sorvall) for 20 min. The cleared cell lysate was incubated with
0.5 ml Ni-Sepharose (6 Fast Flow; GE Healthcare) for 30 min at 8°C on a shaker. The Ni matrix was
washed on a column with 20 ml wash buffer (50 mM Na2HPO4, 50 mM Na2SO4, 100 mM HEPES, 200 mM NaCl,
20 mM imidazole, 14 mM β-mercaptoethanol, pH 8.0), and the protein was eluted with 2.5 ml elution
buffer (same as wash buffer but with 250 mM imidazole). Directly after elution, the buffer was
exchanged on a PD-10 column (Bio-Rad) and eluted with 20 mM Tris-HCl, 300 mM NaCl, pH 8.0, 14 mM
β-mercaptoethanol. Ni-Sepharose purification and buffer exchange were performed at 8°C. The protein
was further purified by size exclusion chromatography (HiLoad 16/60 Superdex-75; GE Healthcare).
The fractions from the peak corresponding to a monomer of the TGEV nsp1 protein were pooled and
diluted four times with 20 mM Tris-HCl, pH 8.0, to a NaCl concentration of 75 mM. The protein was
then applied to a 1-ml HiTrapQ anion-exchange column (GE Healthcare), which was washed with 20 ml
of buffer A (20 mM Tris-HCl, 75 mM NaCl, pH 8.0, and 14 mM β-mercaptoethanol) and eluted with a
gradient to buffer B (20 mM Tris-HCl, 1 M NaCl, pH 8.0, and 14 mM β-mercaptoethanol) over a volume
of 20 ml. Both size exclusion chromatography and anion-exchange chromatography were carried out at
25°C. The TGEV nsp1 eluted at 500 mM NaCl, and the purity of the sample was >98% as judged by
analysis using SDS-PAGE. The protein sample was diluted with 20 mM Tris-HCl to a NaCl concentration
of 150 mM and thereafter concentrated to between 3 and 10 µg/µl in a Vivaspin concentrator
(Vivascience).
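
The two dilution steps in this protocol are plain conservation of solute (C1·V1 = C2·V2). The following is a minimal sketch of that arithmetic, not part of the original protocol. It assumes the diluent (20 mM Tris-HCl) contributes no NaCl and that the pooled size exclusion fractions are at 300 mM NaCl, as implied by the fourfold dilution to 75 mM; the function names are illustrative.

```python
def diluted_concentration(c_start_mm: float, fold: float) -> float:
    """Concentration (mM) after diluting `fold` times with salt-free buffer."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return c_start_mm / fold


def fold_for_target(c_start_mm: float, c_target_mm: float) -> float:
    """Total dilution factor needed to bring c_start_mm down to c_target_mm."""
    if not 0 < c_target_mm <= c_start_mm:
        raise ValueError("target must be positive and not exceed the start")
    return c_start_mm / c_target_mm


# Size exclusion pool (assumed 300 mM NaCl) diluted four times before anion exchange.
print(diluted_concentration(300.0, 4.0))

# Anion-exchange eluate (about 500 mM NaCl) brought to 150 mM before concentration.
print(round(fold_for_target(500.0, 150.0), 2))
```

The second call shows that the final dilution corresponds to roughly a 3.3-fold increase in volume, which is then reversed by the concentration step.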

A second construct with a 5-amino-acid C-terminal truncation and an N-terminal His6 tag was cloned
into the expression vector pEXP5 (Invitrogen). Expression and purification were performed as for
the full-length construct.

Crystallization. For crystallization screening, drops containing 0.5 µl protein solution and 0.5 µl
reservoir solution were set up as sitting-drop vapor diffusion experiments using an Oryx 4
crystallization robot (Douglas Instruments Ltd.). Initial crystal hits were obtained at 20°C under
two conditions in the JCSG+ suite (Qiagen): A9 (200 mM ammonium chloride and 20% [wt/vol]
polyethylene glycol [PEG] 3350) and H7 (200 mM ammonium sulfate, 100 mM Bis-Tris, pH 5.5, and 25%
[wt/vol] PEG 3350). The crystallization conditions were optimized in terms of precipitant, buffer,
pH, and protein concentration. Optimal concentrations of protein and PEG 4000 were batch dependent.
Drops, in volumes varying between 3 and 20 µl, containing protein and reservoir solution in a 2:1
ratio, were set up. The drops were seeded with previously obtained crystals 30 min after setup.
After several rounds of optimization, the best crystals were obtained in 5% (wt/vol) PEG 4000, 200
mM ammonium chloride, 30 mM HEPES, and 30 mM morpholineethanesulfonic acid (MES), pH 6.2, with a
protein concentration of 5 µg/µl. Native crystals were dipped for a few seconds in reservoir
solution supplemented with 15% glycerol before vitrification in liquid nitrogen. Crystals for
phasing were soaked for 2 h in reservoir solution supplemented with 10 mM K2PtCl4 and thereafter
back soaked for 30 min in the same solution without K2PtCl4. The Pt-soaked crystals were
cryoprotected and vitrified as described above.

Data collection, phasing, and refinement. Crystallographic data were collected at the European
Synchrotron Radiation Facility (ESRF), Grenoble, France. Native data were collected at beamline
ID23eh2, at a wavelength of 0.873 Å, to 1.5 Å resolution. Anomalous data were collected at beamline
ID14eh4. Two 360-degree data sets were collected at the Pt edge (λ = 1.072 Å) at different κ
angles, with an oscillation angle of 3 degrees, to a resolution of 2.5 Å. The images were indexed
and integrated in the software program MOSFLM (29) and scaled in the program Scala (30, 31). The
space group was determined to be P1, with two molecules in the asymmetric unit, related by a
noncrystallographic 2-fold axis as revealed by a self-rotation function calculated by the software
program Molrep (31, 32). The solvent content was estimated to be 40%, with a Matthews coefficient
of 2.08 Å^3/Da (31, 33). Four platinum sites were identified by single isomorphous replacement with
anomalous scattering (SIRAS) using the software program ShelxD (34). The sites were further refined
in SHARP (35). Subsequent solvent flattening and histogram matching using the program DM (36)
resulted in a significantly improved electron density map. The Buccaneer software program (37) was
used to create a first trace of the polypeptide backbone. This initial model was further improved
by alternate cycles of model rebuilding in O (38) and refinement in the Buster-TNT program (39,
40). Final refinement was performed using translation, libration, screw (TLS) refinement with the
two chains as separate groups. In the N-terminal region of the A chain, an additional four residues
from the His tag could be modeled. In the B chain, density for the two first residues in the
N-terminal region was missing. For full phasing and refinement statistics, see Table 1.
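
The solvent content quoted above follows from the Matthews coefficient. A minimal sketch of the calculation is given here, using the native unit cell parameters from Table 1. It assumes the commonly used approximation that the protein fraction of the crystal is 1.23/Vm, and, because the molecular mass of the construct is not stated in the text, it back-calculates the mass per chain implied by Vm = 2.08 Å^3/Da rather than taking it as an input. Function names are illustrative.

```python
import math


def triclinic_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Unit cell volume (cubic angstroms) for a general triclinic cell."""
    ca, cb, cg = (math.cos(math.radians(x))
                  for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg
                                 + 2 * ca * cb * cg)


def matthews_coefficient(cell_volume, n_molecules, mol_mass_da):
    """Vm: crystal volume per unit of protein mass (cubic angstroms per Da)."""
    return cell_volume / (n_molecules * mol_mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction, assuming a typical protein density."""
    return 1.0 - 1.23 / vm


# Native cell from Table 1; P1 has one asymmetric unit per cell,
# so the cell holds the two molecules of the asymmetric unit.
volume = triclinic_cell_volume(35.4, 36.0, 42.2, 91.3, 109.1, 94.2)
implied_mass = volume / (2 * 2.08)  # back-calculated, not a reported value

print(f"cell volume: {volume:.0f} A^3")
print(f"implied mass per chain: {implied_mass:.0f} Da")
print(f"solvent fraction at Vm = 2.08: {solvent_fraction(2.08):.2f}")
```

The resulting solvent fraction of about 0.41 is consistent with the reported estimate of 40%.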

Protein structure accession number. The TGEV nsp1 structure has
been deposited in the Protein Data Bank under PDB ID 3ZBD.

RESULTS 

The TGEV nsp1 structure exhibits an irregular β-barrel fold.

Initial crystallization trials were performed with protein from a construct expressing full-length
TGEV nsp1. Crystals were obtained and tested for diffraction, but no data of sufficient quality
could be collected. The sequence from TGEV nsp1 was analyzed using the secondary structure
prediction software programs Phyre (41) and I-Tasser (42), and a new construct was produced,
containing a 5-amino-acid C-terminal truncation. This new construct yielded crystals under the same
crystallization conditions as the full-length protein, and native data were collected to 1.5 Å.
Anomalous data from platinum soaks were collected to 2.5 Å. The structure was subsequently solved
by single isomorphous replacement with anomalous scattering using both data sets. Details of
phasing and refinement are in Table 1.
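
As a quick consistency check on the wavelengths reported for the two data sets, X-ray wavelength and photon energy are related by E = hc/λ. The sketch below uses the standard conversion constant hc ≈ 12.398 keV·Å; the function name is illustrative. The Pt data set wavelength of 1.072 Å corresponds to an energy close to the tabulated Pt L-III absorption edge (about 11.56 keV).

```python
HC_KEV_ANGSTROM = 12.39842  # Planck constant times speed of light, in keV * angstrom


def wavelength_to_energy_kev(wavelength_angstrom: float) -> float:
    """Photon energy in keV for a given X-ray wavelength in angstroms."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


for label, wavelength in (("Native", 0.873), ("Pt", 1.072)):
    print(f"{label}: {wavelength} A = {wavelength_to_energy_kev(wavelength):.2f} keV")
```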

The TGEV nsp1 structure is characterized by an irregular six-stranded β-barrel, flanked by a small
β-sheet connected to a short 3(10) helix (Fig. 1). A 15-amino-acid-long α-helix is placed on the
rim of the barrel. Four antiparallel strands, β3, β7, β5, and β6, make up one side of the barrel,
with β3 and β6 loosely connected to strands β1 and β8, which create the other side of the barrel.
β2 and β4 form a small parallel sheet flanking the barrel adjacent to β3 and β7. Some of the
strands are irregular and have breaks in the hydrogen bonding pattern. This is due to a β-bulge in
β7 involving the carbonyl oxygen from Val64 and amide nitrogens from Gln85 and Gly86, as well as
Pro74, which makes a kink in the end of the strand β6.

TABLE 1 Data collection and refinement statistics (a)

                                          Value for TGEV nsp1
Statistic                                 Native                  Pt

Data collection
  Beamline                                ID23eh2 (ESRF)          ID14eh4 (ESRF)
  Wavelength (Å)                          0.873                   1.072
  Space group                             P1                      P1
  Cell axial lengths (Å)                  35.4, 36.0, 42.2        35.6, 36.1, 42.7
  Cell angles (°)                         91.3, 109.1, 94.2       90.9, 109.4, 93.9
  Resolution range (Å)                    27.4-1.5 (1.58-1.50)    40.3-2.5 (2.64-2.5)
  No. of reflections measured             122,299 (17,500)        167,992 (24,887)
  No. of unique reflections               30,811 (4,441)          6,888 (998)
  Avg multiplicity                        4.0 (3.9)               24.4 (24.9)
  Anomalous avg multiplicity                                      12.0 (12.3)
  Completeness (%)                        96.6 (95.4)             99.6 (99.8)
  Anomalous completeness (%)                                      99.4 (99.8)
  Rmerge                                  0.056 (0.476)           0.079 (0.263)
  <I/σ(I)>                                14.3 (3.0)              45.5 (19.7)

Refinement
  Resolution range (Å)                    26.1-1.5
  No. of reflections used in working set  31,262
  No. of reflections for Rfree calculation  1,535
  R (%)                                   17.5
  Rfree (%)                               20.9
  No. of nonhydrogen atoms                1,718
  No. of solvent waters                   207
  Mean B factor (Å^2)                     23.3
  Ramachandran plot outliers (%) (b)      0.0
  RMSD from ideal bond length (Å) (c)     0.01
  RMSD from ideal bond angle (°) (c)      1.06

(a) Values in parentheses refer to the outer resolution shell. Data collection statistics were
calculated using Scala, part of the CCP4 software program suite (30, 31). Refinement
statistics, except for Ramachandran outliers, were calculated using Buster-TNT (39).
(b) Calculated using a strict-boundary Ramachandran plot definition (51).
(c) Root mean square deviation; ideal values from Engh and Huber (52).
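
The average multiplicity values in Table 1 can be cross-checked against the reflection counts, since multiplicity is the number of measured reflections divided by the number of unique reflections. The following is a minimal sketch of that check using the overall (not outer shell) counts from Table 1; the function and variable names are illustrative.

```python
def multiplicity(n_measured: int, n_unique: int) -> float:
    """Average number of observations per unique reflection."""
    if n_unique <= 0:
        raise ValueError("n_unique must be positive")
    return n_measured / n_unique


# Overall reflection counts from Table 1: (measured, unique).
datasets = {
    "Native": (122_299, 30_811),
    "Pt": (167_992, 6_888),
}

for name, (measured, unique) in datasets.items():
    print(f"{name}: {multiplicity(measured, unique):.1f}")
```

Both values reproduce the tabulated multiplicities (4.0 for the native data and 24.4 for the Pt data) to one decimal place.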

A small cavity is located at the top of the barrel between β5 and β7, next to the α-helix. The
cavity is lined by residues Val18, Pro19, Leu21, Val26, Glu29, Tyr41, Val61, Ile62, and Val89 and
the aliphatic stem of Arg90. Glu29 has a different conformation in the A chain compared to that in
the B chain, and the position of the side chain determines the size of the opening of the cavity to
the solvent.

The TGEV nsp1 sequence shares no significant similarity to any known structures in the Protein Data
Bank. However, a search with the crystal structure in the PDBeFold database resulted in a single
significant hit, the NMR structure of the N-terminal domain of nsp1 from SARS-CoV (residues 13 to
128; PDB ID 2HSX/2GDT), with a q score of 0.37 and a z score of 6.3. The q score indicates the
quality of the alignment, where 1 corresponds to an identical hit. The z score measures statistical
significance of the match, where a higher number corresponds to a higher statistical significance
(43).
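
To make the q score more concrete, the sketch below assumes the definition usually given for PDBeFold (SSM): Q = N_align^2 / ((1 + (RMSD/R0)^2) · N1 · N2), with R0 = 3 Å, where N_align is the number of aligned residues and N1 and N2 are the lengths of the two structures. The number of aligned residues and the RMSD for the TGEV/SARS-CoV match are not reported in the text, so the inputs below are purely hypothetical and serve only to illustrate the behavior of the score.

```python
def q_score(n_aligned: int, rmsd: float, n_res_1: int, n_res_2: int,
            r0: float = 3.0) -> float:
    """Q-score under the assumed SSM definition; 1.0 means identical structures."""
    if min(n_res_1, n_res_2) <= 0 or n_aligned < 0 or rmsd < 0:
        raise ValueError("invalid input")
    return n_aligned ** 2 / ((1 + (rmsd / r0) ** 2) * n_res_1 * n_res_2)


# Identical structures: every residue aligned at zero RMSD.
print(q_score(100, 0.0, 100, 100))

# Hypothetical partial match: half the residues aligned at 3 angstrom RMSD.
print(q_score(50, 3.0, 100, 100))
```

The score drops both when fewer residues can be aligned and when the aligned residues superpose less well, which is why two structures sharing a fold but differing in detail can score well below 1.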

To explore the relationship between the nsp1 proteins in α-CoV and β-CoV B, sequences from both
groups were gathered and aligned separately using the software program Clustal W (44). A careful
structure-based sequence alignment between TGEV nsp1 and SARS-CoV nsp1(13-128) was used to merge
the alignments of α-CoV and β-CoV B.

Conservation within the alphacoronavirus genus. The α-CoV alignment shows a number of highly
conserved areas (Fig. 2). A large portion of the conserved residues in α-CoV make up the
hydrophobic core of the β-barrel fold: these include Val44, Val52, Val61, Leu77, Leu84, Phe87,
Ile88, and Val89. This cluster of residues is connected to a conserved solvent-exposed hydrophobic
patch consisting of Phe43 and Phe100, via Gly86, which is highly conserved due to space restraints.
The Ile23-Arg35 helix shows little conservation. However, it is anchored to the hydrophobic core by
the highly conserved residues Gly37 and Phe38.
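
The notion of per-position conservation used here can be illustrated with a simple column score: the fraction of sequences that carry the most common residue at each alignment position. This is only a sketch of one possible measure, not the scoring scheme used to prepare Fig. 2, and the four short sequences below are invented for illustration rather than taken from any coronavirus.

```python
from collections import Counter


def column_conservation(alignment: list[str]) -> list[float]:
    """Fraction of sequences sharing the most common residue in each column.

    Gaps ('-') are not counted as residues but still count toward the
    total number of sequences, so gapped columns score lower.
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            scores.append(0.0)
            continue
        most_common_count = Counter(residues).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores


# Hypothetical toy alignment (not real nsp1 sequences).
toy_alignment = ["MVGF", "MVGF", "MIGY", "M-GF"]
print(column_conservation(toy_alignment))
```

Positions scoring 1.0 correspond to fully conserved residues, such as Gly86 in the α-CoV alignment described above.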

The TGEV nsp1 structure contains two salt bridges: Lys7-Asp99, connecting β1 to β8, and
Lys103-Asp71, connecting β8 to β6. Although Lys7-Asp99 can also be found in, e.g., mink CoV in the
α-CoV genus, neither of these electrostatic interactions seems to be well retained throughout the
CoV family.

The surface of the TGEV nsp1 structure exhibits two highly conserved areas. The first is located on
a ridge formed by the loops between strands β1-β2 and β7-β8 together with strand β2 (Fig. 3 and 4).
On the ridge, Asp13, Gln15, Asn92, and Asn94, all conserved, are positioned in a ring around Tyr14,
which is not. Gln15 is consistently replaced by a Glu in the other α-CoVs. The second conserved
patch is mainly made up from side chains from strand β8, where the highest level of conservation is
found in the N-terminal part. The cluster also includes one residue from β3, Phe43. Together with
Phe100, this residue forms a small exposed hydrophobic patch. In this conserved area, two more
hydrophobic surface-exposed residues are found: Val96 and Leu97.

FIG 1 (A) Overall structure of TGEV nsp1 in rainbow colors from blue in the N-terminal region to
red in the C-terminal region. (B) Topology diagram with coloring corresponding to that in panel A.
Strands β1, β3, β5, β6, β7, and β8 make up the barrel. Strands β2 and β4 are unique to TGEV nsp1
and are not found in SARS-CoV nsp1(13-128).

FIG 2 Eight nsp1 sequences from α-CoV and four nsp1 sequences from β-CoV B were aligned separately.
The two alignments were subsequently merged by using the three-dimensional structure alignment of
TGEV nsp1 and SARS-CoV nsp1(13-128). The level of sequence conservation within each genus is
highlighted with dark background color and white letters. A darker background color indicates a
higher level of conservation. Residues conserved between the genera are marked with boxes. Residues
likely to be important for α-CoV function are marked by stars. These residues are further
highlighted in Fig. 3 and 4. The β-CoV B consensus sequence suggested by Almeida et al. (28) is
marked by circles. Secondary structure elements from the TGEV nsp1 structure are displayed above
the sequence and colored according to the scheme in Fig. 1. The figure was prepared using the
software program Aline (47).

Overall comparison between α-CoV and β-CoV B. A comparison of TGEV nsp1 with the structure of
SARS-CoV nsp1(13-128) clearly shows that the two structures share the same fold, with a
characteristic six-stranded β-barrel with a long alpha helix on one side of the barrel. However, a
three-dimensional alignment of TGEV nsp1 with SARS-CoV nsp1(13-128) reveals that there are large
differences between the structures. The location of the strands in the barrel is shifted, along
with an outward shift of the α-helix in TGEV nsp1 compared to SARS-CoV nsp1(13-128), where the
helix is positioned closer to the barrel. The loop between β5 and β6 is significantly shorter in
the TGEV nsp1 structure. In addition, the small β-sheet, comprising β2 and β4, flanking the barrel
next to strands β3 and β7, is found only in the TGEV nsp1 structure.

The small cavity with Glu29 as a gatekeeper is not conserved. Instead, there is a narrow tunnel in
the SARS-CoV nsp1(13-128) structure, not found in TGEV nsp1, that stretches through the center of
the barrel. It appears too large to be an artifact from poor packing of the protein core. However,
it is not likely to be conserved throughout the β-CoV B lineage, given the low conservation of the
neighboring side chains.

Thus far, the viruses that belong to β-CoV B show lower diversity than those in α-CoV. An alignment
of nsp1 from four viruses in β-CoV B, including SARS-CoV, shows three conserved areas. The mapping
of these onto the SARS-CoV nsp1(13-128) structure is illustrated in Fig. 4. The two separate
alignments of α-CoV and β-CoV B were merged using the structure-based alignment of TGEV nsp1 and
SARS-CoV nsp1(13-128) (Fig. 2). Interestingly, the conserved regions within each group show very
little overlap in the combined α-CoV and β-CoV B alignment. For example, the β-CoV B nsp1 proteins
show a high level of conservation in helix α1, absent in α-CoV. The conservation pattern in the
barrel is also different between the two groups. The consensus sequence LRKxGxKG, referred to by
Almeida et al. (28), is roughly conserved within β-CoV B. Compared with the α-CoV sequences, only
the two glycines are conserved, both of which seem to be located in the linker region between nsp1
and nsp2. However, a few residues seem to be retained across the α-CoV genus and the β-CoV B
lineage. Most of these, like Ile8, Phe43, Val44, Val52, Ile88, and Val89, are part of the conserved
hydrophobic cluster, extending from the core of the barrel to the surface. Val44 is the center of a
less well conserved cluster on the other side of β3. The β-bulge located by Gly86 seems to be
absolutely conserved throughout α-CoV and β-CoV B and might be a characteristic feature of the nsp1
β-barrel fold.

FIG 3 The surface of TGEV nsp1 displays two areas with high sequence conservation. Two loops make
up the first area (A), where Asp13, Gln15, Asn92, and Asn94 make up a conserved circle. The second
area (B), centered on strand β8, displays both exposed hydrophobic residues and charged residues
that potentially could interact with a partner molecule or partner protein.

FIG 4 Superposed structures of TGEV nsp1 and SARS-CoV nsp1(13-128) are presented separately in two
different rotations: rotation 1 (A to F) and rotation 2 (G to L). Conserved residues within each
genus are shown in green, where darker green corresponds to a higher level of conservation. In each
rotation, TGEV nsp1 and SARS-CoV nsp1(13-128) are shown as cartoons and surface representation
displaying the conserved areas. Subfigures C, F, I, and L show the electrostatic surface potentials
of the two structures from −4/+4 mV. The electrostatic potential was calculated using the software
programs APBS and PDB2PQR using the PARSE force field (48-50). The area within the dotted oval in
subfigure H corresponds to the two conserved areas shown in Fig. 3.

The poor sequence conservation between α-CoV and β-CoV B is also reflected in the surface
electrostatics. The open side of the TGEV nsp1 barrel exhibits a strong negative electrostatic
potential, whereas the long helix features mainly positive electrostatics (Fig. 4C and I). The
SARS-CoV nsp1(13-128) structure reveals a significantly different pattern. The electrostatic
potential also seems to be slightly more conserved in β-CoV B than in α-CoV (Fig. 4).

DISCUSSION 

The high-resolution crystal structure of TGEV nsp1 reveals that nsp1s from α-CoV share a fold with
the N-terminal domain of the nsp1s in β-CoV B, despite their lack of sequence similarity. At the
same time, the structure also highlights that there are important structural differences between
the two lineages, potentially explaining their differences in function. SARS-CoV nsp1 inhibits
interferon (IFN) expression in infected cells (19) and interferes with antiviral signaling pathways
of the host (21). TGEV nsp1, together with SARS-CoV nsp1 and several other CoV nsp1s, can also
efficiently inhibit expression of host mRNA. However, little is known about the mechanism behind
this function.
The structure of TGEV nsp1 is characterized by an irregular six-stranded β-barrel flanked by an
α-helix. In order to identify the evolutionarily conserved areas, an alignment was made from
various nsp1 sequences from viruses in the alpha genus. The conserved residues were plotted onto
the surface of the TGEV nsp1 protein. The conservation pattern within α-CoV does not give any
immediate clues about the function or mechanism of α-CoV nsp1. A large portion of the conserved
residues, centered on the highly conserved strand β7, make up the core of the protein and are more
likely to be involved in the structural stability of the protein than to be important for the
function. However, the TGEV nsp1 surface features two highly conserved patches. From these, a few
residues stand out as candidates for potential interaction with a partner or target molecule. The
patch made up from the two loops between strands β1-β2 and β7-β8 together with strand β2 has four
residues that are of special interest. These are Asp13, which is completely conserved, Gln15, which
is a conserved Glu residue in all α-CoVs except TGEV, and two asparagines on the neighboring loop,
Asn92 and Asn94. These conserved residues are all placed on a protruding ridge. The highest
conservation of the second patch is found mainly on the edge of the ridge and going down on one
side (Fig. 3B), including residues Leu97, Glu98, and Asp99. Both of these patches are potential
surfaces for interaction with another molecule. The protruding shape of the ridge, as opposed to a
cavity or a bowl shape, suggests that the partner molecule may be another protein. There are
indications that TGEV nsp1 may need a host factor for its function. Experiments performed with
cell-free HeLa extracts and rabbit reticulocyte lysate (RRL) reveal that TGEV nsp1 suppresses
protein translation in the former but not the latter, suggesting that a host factor that exists in
the HeLa extracts but not in RRL is needed for TGEV nsp1 function (22).

Intriguingly, the combined alignment of α-CoV and β-CoV B nsp1s shows that there is not much
overlap between the conservation patterns of the two groups. The lack of conservation is also
reflected in the shape and the electrostatics of the TGEV and SARS-CoV nsp1(13-128) structures,
resulting in different three-dimensional volumes despite the similar β-barrel fold (Fig. 4).

It has been previously speculated that the SARS-CoV nsp1 might be a unique SARS protein and that
its ability to suppress host gene expression potentially could account for its elevated
pathogenicity relative to that of other coronaviruses (16). It is now established that nsp1 from
α-CoV, as well as that from β-CoV A and β-CoV B, can induce suppression of host mRNA (18, 19, 20,
22, 23, 45). It is also established that SARS-CoV nsp1 binds the 40S subunit of the ribosome to
make it translationally inactive. The nsp1-40S complex can modify the 5' end of capped mRNA and
induce cleavage in certain mRNAs containing the internal ribosome entry site (IRES). However, this
activity cannot alone account for the substantially reduced expression of the reporter protein,
suggesting that there is an additional mechanism for the suppression of host gene expression (23).

In contrast to these results, although TGEV nsp1 has been shown to effectively suppress host gene
expression, no binding to the 40S ribosomal subunit has been observed (22). TGEV nsp1 also failed
to promote host mRNA degradation (22). In SARS-CoV nsp1, it seems that the ability to bind to the
40S subunit is related to the second domain (residues 129 to 180), since SARS-CoV nsp1 carrying the
K164A and H165A mutations was inactive in terms of 40S binding and consequently unable to degrade
mRNA (23). However, nsp1 proteins from two other alphacoronaviruses, HCoV-229E and HCoV-NL63, have
been shown to immunoprecipitate together with the S6 protein, which is part of the 40S subunit
(46). Interestingly, these two nsp1 proteins share all of the conserved regions in the α-CoV group
(Fig. 2). Thus, it cannot be ruled out that α-CoV TGEV nsp1 might interact with parts of the 40S
ribosomal subunit under certain conditions.

It is tempting to speculate that the β-barrel domain of SARS-CoV nsp1(13-128) and TGEV nsp1 share a
similar mechanism for the additional suppression of host mRNA, not accounted for by the SARS-CoV
nsp1-induced modification of mRNA. However, there are no experimental data to support this. On the
contrary, the K164A H165A double mutation harbored in the second domain renders SARS-CoV nsp1
completely inactive in experiments where TGEV nsp1 effectively suppressed host translation (22).

The structural differences in TGEV nsp1 and SARS-CoV nsp1, together with the available biochemical
data, lead us to speculate that the nsp1 proteins from α-CoV and β-CoV B have different mechanisms
for 40S-independent suppression of host mRNA. However, since the TGEV nsp1 structure has the same
fold as SARS-CoV nsp1(13-128), it is very unlikely that the nsp1s were acquired independently by
the different genera. This suggests that the coronavirus nsp1s are evolutionarily related and that
the different mechanisms are a result of divergent evolution.